10 research outputs found

    ProSLAM: Graph SLAM from a Programmer's Perspective

    In this paper we present ProSLAM, a lightweight stereo visual SLAM system designed with simplicity in mind. Our work stems from the experience gathered by the authors while teaching SLAM to students and aims at providing a highly modular system that can be easily implemented and understood. Rather than focusing on the well-known mathematical aspects of stereo visual SLAM, in this work we highlight the data structures and the algorithmic aspects that one needs to tackle during the design of such a system. We implemented ProSLAM in C++ in combination with a minimal set of well-known external libraries. In addition to an open source implementation, we provide several code snippets that address the core aspects of our approach directly in this paper. The results of a thorough validation performed on standard benchmark datasets show that our approach achieves accuracy comparable to state-of-the-art methods, while requiring substantially less computational resources. Comment: 8 pages, 8 figures

    Bridging Between Computer and Robot Vision Through Data Augmentation: A Case Study on Object Recognition

    Despite the impressive progress brought by deep networks in visual object recognition, robot vision is still far from being a solved problem. The most successful convolutional architectures are developed starting from ImageNet, a large-scale collection of images of object categories downloaded from the Web. Images of this kind are very different from the situated and embodied visual experience of robots deployed in unconstrained settings. To reduce the gap between these two visual experiences, this paper proposes a simple yet effective data augmentation layer that zooms on the object of interest and simulates the object detection outcome of a robot vision system. The layer, which can be used with any convolutional deep architecture, yields an increase in object recognition performance of up to 7% in experiments performed over three different benchmark databases. An implementation of our robot data augmentation layer has been made publicly available.
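The zooming step described above can be illustrated with a minimal sketch. It is not the authors' layer: the function name `zoom_crop_box`, the `scale` parameter, and the clamping policy are assumptions made here for illustration. The sketch only computes the crop rectangle that an augmentation step could cut out around an object's bounding box.

```python
def zoom_crop_box(bbox, image_size, scale):
    """Return a (left, top, right, bottom) crop centred on the object.

    bbox       -- (left, top, right, bottom) of the object, in pixels
    image_size -- (width, height) of the full image, in pixels
    scale      -- hypothetical context factor; 1.0 crops the box exactly,
                  larger values keep more of the surrounding scene
    """
    left, top, right, bottom = bbox
    width, height = image_size

    # Centre of the object and half-extent of the enlarged crop.
    cx = (left + right) / 2.0
    cy = (top + bottom) / 2.0
    half_w = (right - left) * scale / 2.0
    half_h = (bottom - top) * scale / 2.0

    # Clamp the crop to the image borders (an assumed policy).
    return (
        max(0, int(round(cx - half_w))),
        max(0, int(round(cy - half_h))),
        min(width, int(round(cx + half_w))),
        min(height, int(round(cy + half_h))),
    )
```

Sampling `scale` at random per training image would be one way to imitate the varying tightness of a detector's output, though the abstract does not say how the published layer does this.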

    Lang3DSG: Language-based contrastive pre-training for 3D Scene Graph prediction

    3D scene graphs are an emerging 3D scene representation that models both the objects present in the scene and their relationships. However, learning 3D scene graphs is a challenging task because it requires not only object labels but also relationship annotations, which are very scarce in datasets. While it is widely accepted that pre-training is an effective approach to improve model performance in low data regimes, in this paper, we find that existing pre-training methods are ill-suited for 3D scene graphs. To solve this issue, we present the first language-based pre-training approach for 3D scene graphs, whereby we exploit the strong relationship between scene graphs and language. To this end, we leverage the language encoder of CLIP, a popular vision-language model, to distill its knowledge into our graph-based network. We formulate a contrastive pre-training, which aligns text embeddings of relationships (subject-predicate-object triplets) and predicted 3D graph features. Our method achieves state-of-the-art results on the main semantic 3D scene graph benchmark by showing improved effectiveness over pre-training baselines and outperforming all the existing fully supervised scene graph prediction methods by a significant margin. Furthermore, since our scene graph features are language-aligned, we can query the language space of the features in a zero-shot manner. In this paper, we show an example of utilizing this property of the features to predict the room type of a scene without further training. Comment: 3DV 2024. Project page: https://kochsebastian.com/lang3ds
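To make the idea of aligning graph features with text embeddings concrete, here is a minimal sketch of a contrastive objective. The abstract does not state the exact loss, so the InfoNCE-style form, the cosine similarity, the `temperature` value, and the function names are all assumptions; the vectors are plain Python lists standing in for predicted graph features and CLIP text embeddings, and are assumed to be non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(graph_feats, text_embeds, temperature=0.1):
    """InfoNCE-style loss: graph_feats[i] should match text_embeds[i].

    Every other text embedding in the batch acts as a negative.
    """
    n = len(graph_feats)
    total = 0.0
    for i in range(n):
        logits = [cosine(graph_feats[i], t) / temperature for t in text_embeds]
        # Numerically stable log-sum-exp over the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / n
```

The loss is small when each graph feature is closest to its own triplet embedding and grows when features are paired with the wrong text.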

    Plug-and-Play SLAM: A Unified SLAM Architecture for Modularity and Ease of Use

    Nowadays, SLAM (Simultaneous Localization and Mapping) is considered by the robotics community to be a mature field. Currently, there are many open-source systems that are able to deliver fast and accurate estimation in typical real-world scenarios. Still, these systems often provide an ad-hoc implementation tied to predefined sensor configurations. In this work, we tackle this issue by proposing a novel SLAM architecture specifically designed to address heterogeneous sensor configurations and to standardize SLAM solutions. Thanks to its modularity and to specific design patterns, the presented architecture is easy to extend, enhancing code reuse and efficiency. Finally, adopting our solution, we conducted comparative experiments for a variety of sensor configurations, showing competitive results that confirm state-of-the-art performance.

    Standardizing SLAM: exploiting recurrent patterns for modularity and behavioral robustness

    Robots have become present in our everyday life. Robotic vacuum cleaners and lawnmowers take care of our homes, self-driving cars provide personal mobility in radically new ways, collaborative production assistants work side-by-side with humans in modern factories, and last-mile delivery platforms transport goods to their destination in intralogistics and urban spaces. These and many other applications have in common the need for an internal representation of the surrounding environment and require knowing the pose of the robot within this environment. In view of this, researchers have invested substantial effort over the last decades in finding solutions to this problem, converging in a field named Simultaneous Localization and Mapping (SLAM). In recent years, the evolution of this field brought major breakthroughs that led to structural changes in the core algorithms and in the way the SLAM problem was framed. This dynamic evolution made it difficult to find a unified SLAM formulation that generalizes the different research lines pursued by various research laboratories around the world. However, the field has now reached a certain plateau, where all the state-of-the-art SLAM systems have converged towards a graph-based formulation. We believe that it is time for standardization in SLAM and propose a unification approach that defines generalized SLAM interfaces, allowing for fast prototyping thanks to the interchangeability of the basic components developed by different authors. In addition to the architecture, we address the behavioral aspect of SLAM, which plays an important role in the robustness of the system. Reasoning on a higher level of abstraction, above the mere geometric one, is key to robustly handling unforeseen events. In our approach, we create a behavioral control layer on top of a regular SLAM system, which guides the evolution of the SLAM system by deciding the best task to accomplish according to external events, such as the robot being lost, being able to localize, and so on. In this thesis, we address these problems by proposing novel approaches and improvements derived from a careful analysis of state-of-the-art systems, identifying and avoiding their weaknesses while investigating how to combine their strengths. In more detail, we developed (i) a standardized architecture for multi-sensor SLAM systems able to cope with arbitrary robot setups, also providing two fully configurable and working pipelines, and (ii) a behavioral controller for SLAM systems, capable of handling unforeseen events by choosing the best next action to accomplish when needed. These contributions further advance SLAM towards a mature research field as they provide a generalized view of the problem formulation and system designs. They also have a significant practical impact. The modal aspects of SLAM considered here, usually left aside in state-of-the-art systems, are shown to play a key role in robustly dealing with situations that robots face when deployed autonomously in open-world environments.

    Better Lost in Transition Than Lost in Space: SLAM State Machine

    A Simultaneous Localization and Mapping (SLAM) system is a complex program consisting of several interconnected components with different functionalities such as optimization, tracking or loop detection. Whereas the literature addresses in detail how enhancing the algorithmic aspects of the individual components improves SLAM performance, the modal aspects, such as when to localize, relocalize or close a loop, are usually left aside. In this paper, we address the modal aspects of a SLAM system and show that the design of the modal controller has a strong impact on SLAM performance, in particular in terms of robustness against unforeseen events such as sensor failures, perceptual aliasing or kidnapping. We present a novel taxonomy for the components of a modern SLAM system, investigate their interplay and propose a highly modular architecture of a generic SLAM system using the Unified Modeling Language (UML) state machine formalism. The result, called SLAM state machine, is compared to the modal controller of several state-of-the-art SLAM systems and evaluated in two experiments. We demonstrate that our state machine handles unforeseen events much more robustly than the state-of-the-art systems.
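A modal controller of the kind described above can be sketched as a small table-driven state machine. The states, events, and transitions below are hypothetical and chosen only to mirror the modes named in the abstract (localize, relocalize, close a loop); they are not the taxonomy or the UML model from the paper.

```python
from enum import Enum, auto


class State(Enum):
    INITIALIZING = auto()
    TRACKING = auto()
    LOOP_CLOSING = auto()
    LOST = auto()
    RELOCALIZING = auto()


class Event(Enum):
    MAP_INITIALIZED = auto()
    TRACKING_LOST = auto()      # e.g. sensor failure or kidnapping
    LOOP_DETECTED = auto()
    LOOP_CLOSED = auto()
    PLACE_RECOGNIZED = auto()
    POSE_RECOVERED = auto()


# Assumed transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.INITIALIZING, Event.MAP_INITIALIZED): State.TRACKING,
    (State.TRACKING, Event.TRACKING_LOST): State.LOST,
    (State.TRACKING, Event.LOOP_DETECTED): State.LOOP_CLOSING,
    (State.LOOP_CLOSING, Event.LOOP_CLOSED): State.TRACKING,
    (State.LOST, Event.PLACE_RECOGNIZED): State.RELOCALIZING,
    (State.RELOCALIZING, Event.POSE_RECOVERED): State.TRACKING,
    (State.RELOCALIZING, Event.TRACKING_LOST): State.LOST,
}


def step(state, event):
    """Return the next state; events with no listed transition are ignored."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.INITIALIZING):
    """Feed a sequence of events to the controller and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

Keeping the transitions in one table makes the modal behavior explicit and easy to audit, which is the property a state machine formalism is meant to provide.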

    Documentation & detection of colour changes of bas relieves using close range photogrammetry

    The digitization of complex buildings, finds, or bas-reliefs can greatly facilitate the work of archaeologists, mainly for in-depth analysis tasks. However, although new visualization techniques ease the study phase, a classical naked-eye approach to determining changes or surface alterations has several drawbacks. The research work described in these pages is aimed at providing experts with a workflow for the evaluation of alterations (e.g. color decay or surface alterations), allowing a more rapid and objective monitoring of monuments. In more detail, a pipeline was tested to evaluate the color variation between surfaces acquired at different epochs. Reliable change detection tools are needed in the archaeological domain; in fact, the most widespread practice among archaeologists and practitioners is traditional surface monitoring, which consists of three main steps: production of a hand-made map based on a subjective analysis, selection of a subset of regions of interest, and removal of small portions of the surface for in-depth laboratory analysis. Digital automatic change detection represents a turning point in overcoming this risky and time-consuming process. To this end, automatic classification was carried out according to two approaches: a pixel-based and an object-based method. Pixel-based classification aims to identify the classes by means of the spectral information provided by each pixel belonging to the original bands. The object-based approach operates on sets of pixels (objects/regions) grouped together by means of an image segmentation technique. The methodology was tested by studying the bas-reliefs of a temple located in Peru, named Huaca de la Luna. Although the data sources were collected in unplanned surveys, the workflow proved to be a valuable solution for understanding the main changes over time.

    Scintigraphic load of bone disease evaluated by DASciS software as a survival predictor in metastatic castration-resistant prostate cancer patients candidates to 223RaCl treatment

    Background: The aim of our study was to assess the load of bone disease at the start of and during Ra-223 treatment as an overall survival (OS) predictor in metastatic castration-resistant prostate cancer (mCRPC) patients. Bone scan index (BSI) is defined as the percentage of total amount of bone metastasis on whole-body scintigraphic images. We present a dedicated software tool (DASciS) developed by an engineering team at "Sapienza" University of Rome for BSI calculation. Patients and methods: Bone scan images of 127 mCRPC patients were processed with DASciS software, and BSI was tested as an OS predictor. Results: 546 bone scans were analyzed, revealing that the extent of disease is a predictor of OS (0-3% = 28 months of median survival (MoMS); 3%-5% = 11 MoMS; > 5% = 5 MoMS). BSI was analyzed as a single parameter for OS, yielding an AUC of 88%. Moreover, combining BSI with the 3-PS (3-variable prognostic score) markedly improves the AUC (91%), identifying these two parameters as the best OS predictors. Conclusions: This study suggests that OS is inversely correlated with the load of bone disease in mCRPC subjects treated with Ra-223. DASciS software appears to be a promising tool for identifying mCRPC patients who are more likely to benefit from Ra-223 treatment. BSI is proposed as a predictive variable for OS; included in a multidimensional clinical evaluation, it allows patient enrollment to be approached rationally, enhancing treatment effectiveness while optimizing cost.
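The three BSI strata reported above can be written as a simple lookup. This is only an illustrative sketch of the grouping, not part of DASciS: the function name is hypothetical, and because the abstract does not say which group the boundary values 3% and 5% belong to, assigning them to the lower group is an assumption.

```python
def median_survival_months(bsi_percent):
    """Map a bone scan index (in percent) to the reported median survival.

    Values come from the abstract: 0-3% -> 28 months, 3%-5% -> 11 months,
    >5% -> 5 months. Boundary handling (<=) is an assumption.
    """
    if bsi_percent < 0:
        raise ValueError("BSI is a percentage and cannot be negative")
    if bsi_percent <= 3.0:
        return 28
    if bsi_percent <= 5.0:
        return 11
    return 5
```

The mapping reflects group-level medians from one cohort and says nothing about an individual patient's prognosis.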

    Scintigraphic load of bone disease evaluated by DASciS software as a survival predictor in metastatic castration-resistant prostate cancer patients candidates to 223RaCl treatment

    The aim of our study was to assess the load of bone disease at the start of and during Ra-223 treatment as an overall survival (OS) predictor in metastatic castration-resistant prostate cancer (mCRPC) patients. Bone scan index (BSI) is defined as the percentage of total amount of bone metastasis on whole-body scintigraphic images. We present a dedicated software tool (DASciS) developed by an engineering team at "Sapienza" University of Rome for BSI calculation.